The characterization of large-scale structural organization of social networks is an important interdisciplinary
problem. We show, by using scaling analysis and numerical computation, that the following factors are relevant 
for models of social networks: the correlation between friendship ties among people and the position of their 
social groups, as well as the correlation between the positions of different social groups to which a person 
belongs. 
Application of concepts and tools from physics to the
understanding of large-scale structural organization of social
networks is an interesting interdisciplinary topic. This is
particularly so when considering that a social network is
typically a complex network [1] that possesses the small-world
property [2]. There is now a large recent literature concerning
complex networks, for which ideas and methodologies from
statistical and nonlinear physics have proven to be useful
[1, 2]. The purpose of this Letter is to present a quantitative
analysis elucidating some fundamental ingredients required for
models of complex, social networks.

The problem that motivates our analysis is the small-world
phenomenon, according to which any two people are connected by
a short chain of acquaintances [3, 4, 5]. Although sociological
in origin, the small-world phenomenon has been observed in a
variety of natural and man-made systems [1, 2], with examples
ranging from word association [6] to the Internet [7]. The
existence of short paths in these systems has been successfully
described by network models with some degree of randomness
[8, 9, 10]. However, since short paths are present in most
random networks, it is not clear which models are sociologically
more plausible, and the real structure of the network of social
ties remains largely unknown.

A more involved and entirely different issue concerns the
discovery of short paths based only on local information, such
as in a process of target search [11, 12, 13, 14, 15, 16], which
has been only partially understood. In particular, the
phenomenon of quick and easy identification of acquaintances has
not been explained yet at a fundamental level. When two people
are introduced to each other, they are naturally inclined
to look for social connections that can identify them with the
newly introduced person. In this process, they often discover
that they share common friends, that their friends live or work
in the same place, etc. Considering the typically large size of
the communities and the limited number of acquaintances a
person has, this happens with a surprisingly high probability,
even if we accept that people systematically underestimate the
likelihood of coincidences. The often successful identification
of acquaintances is even more striking in view of the very
small number of friends usually mentioned in an introductory
conversation. As we show, the existence of short paths
connecting people, although to some extent necessary, is not a
sufficient condition for the frequent identification of common
friends to occur, even when we consider that strangers who
meet are more likely to have mutual friends than randomly
selected people. Indeed, the networks that account for this
phenomenon contain both random and regular components
and are necessarily highly correlated (to be described below).
This result constrains the possible structure of the actual
network of acquaintances and provides insight into the properties
of social networks. These properties are potentially relevant
to a variety of other networks as well.

A class of social network models has been recently proposed
by Watts, Dodds, and Newman (WDN) [13], which can
explain the letter-sending experiment of Travers and Milgram
[4]. In this model, people are organized into groups according
to their social characteristics. These groups in turn belong
to groups of groups and so on, forming a hierarchy of so- 
cial structure. A different hierarchical scheme is defined for 
each social characteristic O, which is assumed in the WDN- 
model to be completely independent of one another. The net- 
work is then constructed using the notion of social distance 
defined in terms of this set of hierarchies. However, social
groups are often correlated. For example, people who work
or study together are more likely to engage in other activities
together. As we show, a proper level of correlation among
social groups is the key to discovering social connections
between individuals.

Network Model - We consider a community of N people,
which represents for instance the population of a city. People
in this community are assumed to have H relevant social
characteristics that may correspond to professional or private
life attributes. Each of these characteristics defines a nested
hierarchical organization of groups, where people are split into
smaller and smaller subgroups downwards in this nested
structure [see Fig. 1(a)]. Such a hierarchy is characterized by
the number $l$ of levels, the branching ratio b at each level,
and the average number g of people in the lowest groups.
Realistic values of the parameter g are on the order of tens or
hundreds and represent the average size of typical social groups,
such as groups of classmates or co-workers. The set of groups
to which a person belongs defines his or her social
coordinates, so that the social coordinates of person i are the
positions $(x_i^1, \ldots, x_i^H)$ that this person occupies in
the different hierarchies. Given a hierarchy h, a distance
$d(x_i^h, x_j^h)$ along h is defined for each pair of people as
the lowest level (counting from the bottom) at which i and j are
found in the same group [see Fig. 1(a)]. There is one such
distance for each of the H hierarchies.

FIG. 1: Model of a social network. (a) People (dots) belong to
groups (ellipses), which in turn belong to groups of groups and
so on. The largest group corresponds to the entire community. As
we go down in this hierarchical organization, each group
represents a set of people with increasing social affinity. In
the example, there are $l = 3$ hierarchical levels, each
representing a subdivision in $b = 3$ smaller groups, and the
lowest groups are composed of $g = 11$ people, on average. This
defines a social hierarchy. The distance between the highlighted
individuals i and j in this hierarchy is 3. (b) Each hierarchy
can be represented as a tree-like structure. Different
hierarchies are correlated, in the sense that distances that are
short along one of them are more likely to be short along the
others as well. The figure shows an example with $H = 2$
hierarchies, where highlighted in the second hierarchy are those
people belonging to group A in the first one. (c) Pairs of
people at shorter social distances are more likely to be linked
by social ties, which can represent either friendship or
acquaintanceship ties (we do not distinguish between them here
because the ones that are relevant for the problem in question
may depend on the social context). The figure shows, for a
person in the network, the distribution of acquaintances at
social distance D = 1, 2, and 3, where D is the minimum over the
distances along all the hierarchies.
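The distance along a hierarchy defined above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not the authors' implementation: the lowest groups of a hierarchy are assumed to be labeled 0, 1, ..., b^(l-1) - 1 from left to right in the tree, so that integer division by b moves one level up, and the function names `hierarchy_distance` and `social_distance` are hypothetical.

```python
from typing import Sequence


def hierarchy_distance(group_i: int, group_j: int, b: int) -> int:
    """Lowest level, counted from the bottom and starting at 1, at which
    two lowest-group labels fall inside the same group of a b-ary tree."""
    level = 1
    while group_i != group_j:
        # Move both labels one level up the hierarchy.
        group_i //= b
        group_j //= b
        level += 1
    return level


def social_distance(coords_i: Sequence[int], coords_j: Sequence[int], b: int) -> int:
    """Minimum of the hierarchy distances over all hierarchies, given one
    lowest-group label per hierarchy for each person."""
    return min(hierarchy_distance(x, y, b) for x, y in zip(coords_i, coords_j))
```

With b = 3 and three levels, as in the example of Fig. 1, two people in the same lowest group are at distance 1, and people in the leftmost and rightmost lowest groups are at distance 3.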



To be concrete, we consider a network dominated by only
two hierarchies [19] (generalization to higher dimensions is
straightforward). The correlation between social groups is
incorporated in the position a person has in each hierarchy. The
first hierarchy is constructed by assigning people randomly to
the lowest groups. The second hierarchy is generated from
the first by shuffling the position of each person according
to a given distribution, which we assume to be exponential.
Namely, each person is reassigned to a new position at distance
$y \in \{1, 2, \ldots, l\}$ from the original position with
probability $P_\beta(y) = B \exp(-\beta y)$, where
$B^{-1} = \sum_{k=1}^{l} \exp(-\beta k)$, so that the constant
$\beta$ characterizes the correlation between social groups. For
$\beta > -\ln b$, people who are close along one hierarchy are
more likely to be close along the other hierarchy as well, as
shown in Fig. 1(b). In the limit $\beta \gg -\ln b$, both
hierarchies become identical and the model reduces to the case
where $H = 1$. The WDN-model corresponds approximately to the
uncorrelated case where $\beta \approx -\ln b$.
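The construction of the second hierarchy can be sketched in Python as follows. This is a minimal sketch under assumptions that are not in the text: positions are identified with lowest-group labels 0, ..., b^(l-1) - 1, the new group is drawn uniformly among the groups at the sampled distance, and group sizes are allowed to fluctuate after the shuffle. All function names are hypothetical.

```python
import math
import random


def distance_distribution(rate: float, levels: int) -> list[float]:
    """Normalized weights proportional to exp(-rate * y) for y = 1..levels."""
    weights = [math.exp(-rate * y) for y in range(1, levels + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def group_at_distance(group: int, y: int, b: int, rng: random.Random) -> int:
    """Uniformly pick a lowest group at hierarchy distance exactly y."""
    if y == 1:
        return group
    block = b ** (y - 1)  # lowest groups below the level-y ancestor
    sub = b ** (y - 2)    # lowest groups below the level-(y-1) ancestor
    block_start = (group // block) * block
    sub_start = (group // sub) * sub
    # Draw from the block while skipping the sub-block that contains `group`.
    candidate = block_start + rng.randrange(block - sub)
    if candidate >= sub_start:
        candidate += sub
    return candidate


def second_hierarchy(first: list[int], beta: float, levels: int, b: int,
                     seed: int = 0) -> list[int]:
    """Reassign every person to a group at a distance y drawn from the
    exponential distribution with rate beta."""
    rng = random.Random(seed)
    probs = distance_distribution(beta, levels)
    ys = rng.choices(range(1, levels + 1), weights=probs, k=len(first))
    return [group_at_distance(grp, y, b, rng) for grp, y in zip(first, ys)]
```

Since the number of lowest groups at distance y grows roughly as b^y, choosing beta close to -ln b makes the new group nearly uniform over the hierarchy, while a large beta leaves almost everyone in place.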

While the social groups do not represent actual social ties, 
the probability of having a link between two people depends 
on the social distance between them 10. This can be mod- 
eled by choosing a person i and a hierarchy h at random and 
linking this person to another person j at a distance
$x = d(x_i^h, x_j^h)$ along h with probability
$P_\alpha(x) = A \exp(-\alpha x)$, where
$A^{-1} = \sum_{k=1}^{l} \exp(-\alpha k)$, and the correlation
parameter $\alpha$ is a measure of social affinity between
acquaintances. This process is repeated until the average number
of links per person is n, so that n represents the average
number of acquaintances a person has. The distance between
acquaintances will be the shortest for $\alpha \gg -\ln b$, and
typically much larger for $\alpha \approx -\ln b$ due to the
uniform distribution of ties. Random networks are then produced
when $\alpha \approx -\ln b$, while regular networks are
produced only when $\alpha$ and $\beta$ are both large. A
realistic social network is expected to fall somewhere in the
wide region in between these two extremes, as illustrated in
Fig. 1(c). In this region, the networks exhibit properties of
small-world networks [8], which have been used to describe
different kinds of social collaboration networks.
Identification of Acquaintances - We assume that a person
knows another person when he or she knows the social
coordinates of the other. When two people are introduced to
each other, the information they are likely to exchange first
is that defining their social coordinates. Next, they exchange
information about their social connections, by mentioning the
social coordinates of their acquaintances. Our goal here is
to compute the probability that the newly introduced people
find themselves linked to each other through a short chain of
friendship or acquaintanceship ties.

Our model of the process of introduction of two people
starts with each stranger informing the other of his or her
social coordinates. Then, at each time step, (1) one stranger
cites the social coordinates of an acquaintance closest to the
other stranger (but not cited yet) with respect to the minimum
of the distances over all the hierarchies,
$D(i,j) = \min_h d(x_i^h, x_j^h)$; and (2) the other stranger
recognizes if the cited person is a mutual acquaintance or an
acquaintance within social distance D = 1 of some of his or
her acquaintances. The two strangers then repeat (1) and (2),
switching their roles at every time step, until the
identification in (2) succeeds or they run out of acquaintances
to cite.

FIG. 2: Identification of acquaintances. (a) Probability that
two randomly chosen people have common acquaintances (circles),
acquaintances in the same lowest group (squares), and
acquaintances who know each other (stars). Inset: blow-up of the
probability of having common acquaintances. (b) Average number
of steps two strangers need to find a common acquaintance, given
that it exists. (c) Probability that randomly chosen strangers
find common acquaintances (circles), acquaintances in the same
lowest group (squares), and acquaintances in the same lowest
group who know each other (stars), in up to m = 1, 2, and 20
steps (from bottom to top). Inset: blow-up of the probability of
finding common acquaintances. (d) Probability that two people in
the same lowest group know each other. In the computations
shown, we set $\beta = \alpha$, but similar results were
observed for any path in the $\alpha\beta$-plane interpolating
from random to regular networks. The other parameters are
$N = 10^6$, $n = 250$, $g = 100$, $b = 10$, and $H = 2$, which
makes $l = 5$. The size N of the networks is typical for the
population of a large metropolitan city, and the average number
of acquaintances n is consistent with empirical values [23].
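The introduction procedure described above can be sketched in Python as follows. This is only a sketch under stated assumptions: `neighbors` maps each person to a set of acquaintances, `distance(p, q)` returns the social distance D between two people, ties in step (1) are broken by the smaller label, and a step is counted each time someone is cited. The function name `introduction` and the returned labels are hypothetical.

```python
def introduction(first, second, neighbors, distance, max_steps=None):
    """Simulate two strangers alternately citing acquaintances.
    Returns (step, kind) on success, or None if nothing is identified."""
    uncited = {first: set(neighbors[first]), second: set(neighbors[second])}
    speaker, listener = first, second
    step = 0
    while uncited[first] or uncited[second]:
        if max_steps is not None and step >= max_steps:
            return None
        if uncited[speaker]:
            step += 1
            # (1) Cite the not-yet-cited acquaintance closest to the listener.
            cited = min(uncited[speaker], key=lambda p: (distance(p, listener), p))
            uncited[speaker].remove(cited)
            # (2) The listener checks for a mutual acquaintance, or for an
            # acquaintance at social distance 1 from the cited person.
            if cited in neighbors[listener]:
                return step, "common"
            if any(distance(cited, q) == 1 for q in neighbors[listener]):
                return step, "same group"
        speaker, listener = listener, speaker
    return None
```

The sketch does not distinguish whether two acquaintances found in the same lowest group actually know each other; that case could be added by also checking whether the cited person is linked to the matching acquaintance of the listener.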

The probability that two randomly chosen people have
common acquaintances, acquaintances at social distance 1
(i.e., in the same lowest group), or acquaintances who know
each other, decreases to very small values as the network is
made more and more regular, as shown in Fig. 2(a). This
happens because in a regular configuration, most of the social
ties connect people at short distances, and hence the
acquaintances of two people will overlap only if they are
socially close, which is unlikely to be the case for pairs of
randomly chosen people in the community. For a random
configuration, on the other hand, there is a non-negligible
probability of overlap for any two people because their
acquaintances are uniformly distributed over the entire network.
One might then be tempted to think that the quick discovery of
common acquaintances is due to the randomness of the network.
This, however, is far from being the case, as shown below.
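The size of the overlap probability in the random limit can be estimated with a short Python calculation. This is a back-of-the-envelope sketch rather than a reproduction of the figure: it assumes that each of the two people has exactly n acquaintances drawn uniformly at random from the remaining N - 2 people, and the function name `prob_common_acquaintance` is hypothetical.

```python
def prob_common_acquaintance(num_people: int, n: int) -> float:
    """Probability that two people, each with n acquaintances chosen
    uniformly among the other num_people - 2 people, share at least one."""
    others = num_people - 2
    p_none = 1.0
    for i in range(n):
        # The i-th acquaintance of the second person must avoid all n
        # acquaintances of the first person.
        p_none *= (others - n - i) / (others - i)
    return 1.0 - p_none


if __name__ == "__main__":
    print(prob_common_acquaintance(10**6, 250))
```

For N = 10^6 and n = 250 this idealized estimate is close to n^2/N, that is, a few percent, and it roughly halves when N is doubled.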

In Fig. 2(b) we display the average number of steps needed
for randomly chosen strangers to find a common acquaintance,
given that it exists. In contrast to Fig. 2(a), the number
of steps increases sharply as the randomness of the network
is made larger, which means that it is extremely difficult to
identify common acquaintances in random networks. Indeed,
while in the regular regime only a few steps are required on
average, in the random regime it requires well over a hundred
steps. This happens because, in the random limit, the social
coordinates of a person are completely uncorrelated with his
or her social ties, and hence do not give any clue for the
position of the person's acquaintances. Accordingly, since only
a few among n acquaintances are typically shared with the
other person, they need to go through many steps to identify
the overlap. When there is a single common acquaintance, the
average number of steps approaches n, which is on the order
of hundreds. Therefore, the probability that two people have
common acquaintances is larger for random networks, but if
common acquaintances exist it is easier for these people to
find them when the underlying network is regular.

Taken together, these results imply that the identification
of acquaintances is most probable in between these two
extremes, which is verified in Fig. 2(c). In this figure, we
display the probability that two randomly chosen people identify
a common acquaintance or acquaintances in the same lowest
group in m or fewer steps. For small m, these probabilities are
small in the regular and random regimes, but they are
significantly larger for a class of networks within the
small-world region. This result expresses a trade-off between
the overlaps and the clues for people to find the overlaps based
only on local information [22].

In addition, our model justifies a tacit assumption people
make about the structure of the social network. When the
introduced people find that they have acquaintances in the same
social group, they tacitly assume that those two acquaintances
probably know each other. This probability is much higher
for regular than for random networks, as shown in Fig. 2(d).
In fact, in a completely regular network the probability
approaches 1 as every pair of people at social distance 1 know
each other, while in the random limit it approaches $n/(N-1)$,
which is nearly zero. In Fig. 2(c), we show the corresponding
probability that, in the process of introduction, the strangers
identify acquaintances at social distance 1 who actually know
each other (stars). This probability also presents a pronounced
maximum in the small-world region, consistent with the
intuition that people belonging to the same group are likely to
be acquainted.

We now consider the scaling with the system size N. The
probability that the identification of acquaintances happens in
the first step is
$P_1 = \sum_{k=1}^{l} \sum_{k'=1}^{l} Q(k) R(k,k') S(k')$,
where $Q(k)$ is the probability that the strangers are at social
distance $k$ from each other, $R(k,k')$ is the probability that
the acquaintance first cited (by the first stranger) is at social
distance $k'$ from the second stranger, and $S(k')$ is the
probability that the second stranger recognizes this
acquaintance either as being his or her own acquaintance or as
being in the same social group as one of them. Because of the
symmetry, the probability after 2 steps is
$P_2 = P_1 + (1 - P_1) P_1$. To be specific, consider the case
$H = 1$ for $b \gg 1$, $g \gg 1$, $n < g$, and strangers
randomly chosen in the community. Then we have
$Q(k) \approx b^{k-l}$,
$R(k,k') \approx [1 - b^{k'-2}/A_k]^{B_k} - [1 - b^{k'-1}/A_k]^{B_k}$,
and $S(k') = B_{k'}/(g A_{k'})$ for common acquaintances,
$S(k') = C_{k'}/A_{k'}$ for acquaintances in the same lowest
group, and $S(k') = n P_\alpha(1) C_{k'}/(g A_{k'})$ for
acquaintances in the same group who know each other, where
$A_k = b^{k-1}$, $B_k = n P_\alpha(k)$, and
$C_k = A_k [1 - \exp(-B_k/A_k)]$. The asymptotic behavior of the
probabilities $P_1$ and $P_2 \approx 2P_1$ is roughly
$P \sim 1/N$, where $N = N(b)$, as shown in Fig. 3 for
$\alpha = 0$. The same scaling is observed for any $\alpha$.
Therefore, the probabilities do not scale with the diameter of
the social network, which in the small-world region increases
only logarithmically with N. The rationale behind this result is
that the probability of identification of common acquaintances
is limited by the probability that common acquaintances actually
exist, which for randomly chosen pairs of people decreases
as $1/N$. Incidentally, although the probabilities in Fig. 2(c)
decrease if the number N of people is increased, a sharp
maximum in the intermediate region is always observed.

FIG. 3: Probability that the identification of acquaintances
happens in up to m steps as a function of the number N of people
in the community. The continuous lines correspond to our theory
and the symbols to the numerical verification. We set $m = 2$,
$H = 1$, $n = 19$, $g = 20$, $l = 5$, and $\alpha = 0$. The
legends are the same as in Fig. 2(c). The dotted line is plotted
for reference and corresponds to $P \sim 1/N$.
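As a small consistency check on the first factor of the scaling argument, the following Python sketch compares the exact fraction of people at hierarchy distance k from a given person with the large-b form b^(k-l), which is how the approximation for Q(k) is read here. The sketch assumes a single hierarchy whose lowest groups all contain exactly g people, so that N = g b^(l-1), and the function name `exact_q` is hypothetical.

```python
def exact_q(k: int, b: int, g: int, levels: int) -> float:
    """Fraction of the other people at hierarchy distance exactly k from a
    given person, for lowest groups of exactly g people."""
    num_people = g * b ** (levels - 1)
    if k == 1:
        at_k = g - 1  # the rest of one's own lowest group
    else:
        at_k = g * (b ** (k - 1) - b ** (k - 2))
    return at_k / (num_people - 1)


if __name__ == "__main__":
    b, g, levels = 10, 100, 5
    for k in range(1, levels + 1):
        print(k, exact_q(k, b, g, levels), b ** (k - levels))
```

For k of 2 or more the exact value differs from b^(k-l) by a factor close to 1 - 1/b, so the approximation improves as the branching ratio grows.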

Conclusions - We have shown that the network of social
ties must be a small world with a high degree of correlation for
the empirically observed frequent identification of
acquaintances to be possible. This sheds new light on the
large-scale organization of society, as it imposes constraints
on the possible structure of the network of acquaintances. These
constraints give a criterion for plausible models of social
networks, which has implications for issues of critical concern
such as the spread of diseases, homeland defense, and the
propagation of influence in economic and political systems,
where the formation and behavior of social groups play important
roles. In particular, since the dynamics of many biological
agents is driven by social contacts, reliable models of social
networks are essential for efforts to reduce the threat of
biological pathogens and for making decisions in the case of
massive biological attacks. Another important conclusion of our
work is that the probability of finding a short chain of
acquaintances between two people does not scale with typical
distances in the underlying network of social ties, either with
respect to system size or across different degrees of
correlation. For instance, random networks are usually "smaller"
than small-world networks, and because of that they are
sometimes themselves called small-world networks. But our work
shows that a random society would not allow people to find
easily that "It is a small world!"